Model-Based Imitation Learning Using Entropy Regularization of Model and Policy
Authors
Abstract
Approaches based on generative adversarial networks for imitation learning are promising because they are sample efficient in terms of expert demonstrations. However, training a generator requires many interactions with the actual environment because model-free reinforcement learning is adopted to update the policy. To improve the sample efficiency using model-based learning, we propose Model-Based Entropy-Regularized Imitation Learning (MB-ERIL) under the entropy-regularized Markov decision process, which reduces the number of interactions with the actual environment. MB-ERIL uses two discriminators. A policy discriminator distinguishes the actions generated by the robot from expert ones, and a model discriminator distinguishes counterfactual state transitions generated by the model from actual ones. We derive structured discriminators so that learning of the policy and the model is efficient. Computer simulations and real robot experiments show that MB-ERIL achieves competitive performance and significantly improves sample efficiency compared with baseline methods.
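The two-discriminator setup described in the abstract can be sketched with plain GAN-style binary cross-entropy objectives. This is a minimal illustrative sketch, not the paper's method: the function names and toy data are assumptions, and MB-ERIL itself derives *structured* discriminators from the entropy-regularized formulation rather than the generic sigmoid classifiers used here.

```python
import numpy as np

def bce_loss(scores, labels):
    """Binary cross-entropy for discriminator outputs (probabilities in (0, 1))."""
    eps = 1e-12
    scores = np.clip(scores, eps, 1 - eps)
    return float(-np.mean(labels * np.log(scores) + (1 - labels) * np.log(1 - scores)))

def policy_discriminator_loss(d_expert, d_robot):
    """Policy discriminator: expert state-action pairs labeled 1, robot pairs 0."""
    scores = np.concatenate([d_expert, d_robot])
    labels = np.concatenate([np.ones_like(d_expert), np.zeros_like(d_robot)])
    return bce_loss(scores, labels)

def model_discriminator_loss(d_real, d_model):
    """Model discriminator: actual transitions labeled 1, model-generated
    (counterfactual) transitions labeled 0."""
    scores = np.concatenate([d_real, d_model])
    labels = np.concatenate([np.ones_like(d_real), np.zeros_like(d_model)])
    return bce_loss(scores, labels)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy discriminator outputs on small batches.
    p_loss = policy_discriminator_loss(rng.uniform(0.6, 0.9, 32), rng.uniform(0.1, 0.4, 32))
    m_loss = model_discriminator_loss(rng.uniform(0.6, 0.9, 32), rng.uniform(0.1, 0.4, 32))
    print(p_loss, m_loss)
```

Training would alternate minimizing these losses for the discriminators with updating the policy and the learned transition model against them, so that environment interactions can be partly replaced by model rollouts.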
Similar resources
Model-Free Imitation Learning with Policy Optimization
In imitation learning, an agent learns how to behave in an environment with an unknown cost function by mimicking expert demonstrations. Existing imitation learning algorithms typically involve solving a sequence of planning or reinforcement learning problems. Such algorithms are therefore not directly applicable to large, high-dimensional environments, and their performance can significantly d...
Model-based Adversarial Imitation Learning
Generative adversarial learning is a popular new approach to training generative models which has been proven successful for other related problems as well. The general idea is to maintain an oracle D that discriminates between the expert’s data distribution and that of the generative model G. The generative model is trained to capture the expert’s distribution by maximizing the probability of ...
Probabilistic model-based imitation learning
Efficient skill acquisition is crucial for creating versatile robots. One intuitive way to teach a robot new tricks is to demonstrate a task and enable the robot to imitate the demonstrated behavior. This approach is known as imitation learning. Classical methods of imitation learning, such as inverse reinforcement learning or behavioral cloning, suffer substantially from the correspondence pro...
Cost benefits of rehabilitation after acute coronary syndrome in Iran; using an epidemiological model
No abstract available.
Mortality forecasting based on the Lee–Carter model
Over the past decades, a number of approaches have been applied to forecasting mortality. In 1992, a new method for long-run forecasts of the level and age pattern of mortality was published by Lee and Carter. This method was welcomed by many authors, so it was extended to a wider class of generalized, parametric and nonlinear models. This model represents one of the most influential recent d...
Journal
Title: IEEE Robotics and Automation Letters
Year: 2022
ISSN: 2377-3766
DOI: https://doi.org/10.1109/lra.2022.3196139